Initialize Project

# Uncomment the lines below if the Rmd file is placed in a subdirectory
# library(knitr)
# opts_knit$set(root.dir = normalizePath('../')) 

Explore

Number of applications received by week of year, colored by month

## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
## Warning: Ignoring unknown parameters: binwidth, bins, pad
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`
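
The weekly counts behind this chart can be sketched in base R as follows. The `apps` data frame, its `submit_date` column, and the dates are hypothetical stand-ins, since the report does not show its input data; the report itself draws the chart with ggplot2 and plotly.

```r
# Hypothetical submission dates (the report's real data is not shown).
apps <- data.frame(
  submit_date = as.Date(c("2016-01-04", "2016-01-06", "2016-01-12",
                          "2016-02-01", "2016-02-03", "2016-12-19"))
)

# Derive the week of the year (ISO 8601) and the calendar month.
apps$week  <- as.integer(format(apps$submit_date, "%V"))
apps$month <- as.integer(format(apps$submit_date, "%m"))

# Cross-tabulate: one row per week, one column per month.
week_by_month <- table(week = apps$week, month = apps$month)
print(week_by_month)
```

Passing `t(week_by_month)` to `barplot()` would give one stacked bar per week with one segment per month.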

Number of applications received by week of year, colored by source code

Recency by week: week number of the last submission and week number of the submission before it (current week = 51)

Some EDA on recency

## Warning: Removed 592 rows containing non-finite values (stat_bin).
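
A minimal sketch of how such recency features could be derived, assuming one row per submission with an `agent_code` and a week number. The `subs` table and the helper `recency_one` are hypothetical; only `current_week = 51` comes from the report. Reading "the week before last submit" as the second most recent distinct submission week is also an assumption.

```r
current_week <- 51  # stated in the report

# Hypothetical submissions: one row per application.
subs <- data.frame(
  agent_code = c("A1", "A1", "A1", "A2", "A3", "A3"),
  week       = c(10, 48, 50, 30, 51, 51)
)

# For one agent: most recent week and the distinct week before it (NA if none).
recency_one <- function(w) {
  w <- sort(unique(w), decreasing = TRUE)
  c(last_week = w[1], prev_week = if (length(w) > 1) w[2] else NA)
}

rec <- t(sapply(split(subs$week, subs$agent_code), recency_one))
rec <- data.frame(agent_code = rownames(rec), rec, row.names = NULL)

# Recency: number of weeks since the last submission.
rec$recency <- current_week - rec$last_week
print(rec)
```

Agents with a single submission week get `NA` for `prev_week` in this sketch.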

Summary by day of week (Mon = 1 .. Sun = 7)

Summary by week
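
Both summaries can be sketched with base R date formatting: `%u` gives the ISO day of week (Mon = 1 .. Sun = 7) and `%V` the ISO week number. The `subs` table and its dates are hypothetical.

```r
# Hypothetical submissions with their dates.
subs <- data.frame(
  agent_code  = c("A1", "A1", "A2", "A2", "A2"),
  submit_date = as.Date(c("2016-12-19", "2016-12-20", "2016-12-19",
                          "2016-12-25", "2016-12-26"))
)

subs$dow  <- as.integer(format(subs$submit_date, "%u"))  # 1 = Monday .. 7 = Sunday
subs$week <- as.integer(format(subs$submit_date, "%V"))  # ISO 8601 week number

# Keep all seven weekdays in the summary, even those with no submissions.
dow_summary  <- table(factor(subs$dow, levels = 1:7))
week_summary <- table(subs$week)

print(dow_summary)
print(week_summary)
```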

EDA on time: submissions by day of week and by week

## Warning: Removed 292 rows containing non-finite values (stat_bin).
## Warning: Removed 10 rows containing missing values (geom_path).

## Saving 7 x 5 in image
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Product and bundle preference

EDA on Contribution by Products / Bundle

## Saving 7 x 5 in image
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Combine all features

## Joining, by = "agent_code"
## Joining, by = "agent_code"
## Joining, by = "agent_code"
## Joining, by = "agent_code"
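
The `Joining, by = "agent_code"` message is what dplyr prints when a join is called without an explicit `by`, so four joins on `agent_code` are being chained here. A dependency-free sketch of the same idea with base `merge()` follows; the five feature tables, their columns, and the choice of a left join are all assumptions, since the report does not show them.

```r
# Hypothetical per-agent feature tables; names and values are illustrative.
recency <- data.frame(agent_code = c("A1", "A2", "A3"), recency = c(1, 21, 0))
dow     <- data.frame(agent_code = c("A1", "A2", "A3"), weekday_share = c(0.9, 0.6, 1.0))
weekly  <- data.frame(agent_code = c("A1", "A2"), apps_per_week = c(2.5, 0.4))
product <- data.frame(agent_code = c("A1", "A3"), n_products = c(3, 1))
bundle  <- data.frame(agent_code = c("A1", "A2", "A3"), bundle_share = c(0.2, 0.0, 0.7))

# Chain four left joins on agent_code; unmatched agents get NA for that feature.
features <- Reduce(
  function(x, y) merge(x, y, by = "agent_code", all.x = TRUE),
  list(recency, dow, weekly, product, bundle)
)
print(features)
```

Naming the key explicitly (`by = "agent_code"`) avoids relying on whichever columns happen to share a name.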

Clustering I

Hierarchical

## clust_group
##    1    2    3    4    5 
## 1517  481  722  175  155
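
A table like the one above is what `table()` returns on the output of `cutree()` with `k = 5`. The sketch below shows that pipeline on synthetic data; the feature columns, the Euclidean distance, and the Ward linkage are assumptions, as the report does not state which were used.

```r
set.seed(1)

# Synthetic stand-in for the combined per-agent feature table.
features <- data.frame(
  recency      = runif(50, 0, 51),
  n_apps       = rpois(50, 5),
  bundle_share = runif(50)
)

# Standardize so that no feature dominates the distance.
scaled <- scale(features)

# Agglomerative clustering, then cut the tree into five groups.
hc <- hclust(dist(scaled), method = "ward.D2")
clust_group <- cutree(hc, k = 5)

print(table(clust_group))
```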

EDA on clustering efficiency

Mapping agent clusters to advisor performance groups

## 
##    1    2    3    4    5 
## 1517  481  722  175  155
## 
##    1    2    3    4    5 
## 1496  481  708  161  155
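
The first table sums to 3,050 agents and the second to 3,001, which would be consistent with an inner join that drops agents lacking a performance group; that reading is an assumption. A small sketch with hypothetical `clusters` and `advisors` tables shows the effect:

```r
# Hypothetical cluster assignments and advisor performance groups.
clusters <- data.frame(
  agent_code  = paste0("A", 1:6),
  clust_group = c(1, 1, 2, 3, 4, 5)
)
advisors <- data.frame(
  agent_code = paste0("A", c(1, 2, 3, 5, 6)),
  perf_group = c("A+", "B", "A+", "DM", "Pro")
)

print(table(clusters$clust_group))  # cluster sizes before mapping

# merge() defaults to an inner join: agents missing from `advisors` are dropped.
mapped <- merge(clusters, advisors, by = "agent_code")
print(table(mapped$clust_group))    # cluster sizes after mapping

# Agents lost in the mapping.
dropped <- setdiff(clusters$agent_code, mapped$agent_code)
print(dropped)
```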

Analysis I

Composition of sales types (clusters) for high-performance advisors

##            1        2        3        4         5
## A   40.37267 14.90683 31.67702 5.590062 7.4534161
## A+  39.76510 28.18792 19.46309 4.865772 7.7181208
## B   48.17391 15.30435 22.78261 5.739130 8.0000000
## C   55.34351 12.97710 23.28244 2.671756 5.7251908
## D   55.61139 11.72529 24.45561 7.370184 0.8375209
## DM  61.79775 11.23596 20.67416 3.820225 2.4719101
## Pro 18.10345 19.82759 58.62069 1.724138 1.7241379
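
Each row of the table sums to 100, so the values read as row percentages: the share of each cluster within an advisor group. A sketch of how such a table can be produced with `prop.table()`, using a small hypothetical `mapped` table:

```r
# Hypothetical mapping of advisors to performance group and cluster.
mapped <- data.frame(
  perf_group  = c("A+", "A+", "A+", "A+", "DM", "DM", "DM", "DM", "DM"),
  clust_group = c(2, 2, 5, 1, 1, 1, 1, 3, 2)
)

# Counts of advisors per (performance group, cluster) cell.
counts <- table(mapped$perf_group, mapped$clust_group)

# margin = 1 normalizes within each row, so every row sums to 100.
row_pct <- prop.table(counts, margin = 1) * 100
print(round(row_pct, 1))
```

Calling `mosaicplot(counts)` on the same count table draws the corresponding mosaic plot.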

The mosaic plot shows how the mix of sales types differs across advisor groups:

A+, A and B: a larger share of ‘type 5 - high, switching’ than the other groups (about 7-8% versus under 6%).

A+: the largest share of ‘type 2 - moderate, continue’ of any group (28.2%).

Pro: mostly ‘type 3 - New active’ (58.6%), since this group consists mostly of new sales, selling mostly bundles.

DM: interestingly, the largest share of ‘type 1 - little, dormant’ (61.8%).

Clustering II

K-Means clustering

EDA clustering efficiency
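
A minimal K-means sketch on synthetic data, with one common way of gauging how well the clusters separate: the ratio of between-cluster to total sum of squares, plus an elbow curve over several values of k. The features, `k = 5` (chosen to match the hierarchical run), and this notion of "efficiency" are assumptions rather than the report's stated method.

```r
set.seed(42)

# Synthetic stand-in for the combined per-agent feature table.
features <- data.frame(
  recency       = runif(200, 0, 51),
  apps_per_week = rexp(200, 1),
  bundle_share  = runif(200)
)
scaled <- scale(features)

# K-means with several random starts to reduce sensitivity to initialization.
km <- kmeans(scaled, centers = 5, nstart = 25, iter.max = 50)
print(table(km$cluster))

# Share of total variance explained by the cluster assignment.
explained <- km$betweenss / km$totss
print(explained)

# Elbow curve: total within-cluster sum of squares for k = 1..8.
wss <- sapply(1:8, function(k) {
  kmeans(scaled, centers = k, nstart = 25, iter.max = 50)$tot.withinss
})
print(round(wss, 1))
```

A flattening of `wss` beyond some k suggests that additional clusters add little.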